Stability-Guaranteed Reinforcement Learning for Contact-Rich Manipulation
Abstract
Reinforcement learning (RL) has had its fair share of success in contact-rich manipulation tasks, but it still lags behind in benefiting from advances in robot control theory such as impedance control and stability guarantees. Recently, the concept of variable impedance control (VIC) was adopted into RL with encouraging results. However, the more important issue of stability remains unaddressed. To clarify the challenge in stable RL, we introduce the term all-the-time-stability, which unambiguously means that every possible rollout should be stability certified. Our contribution is a model-free RL method that not only adopts VIC but also achieves all-the-time-stability. Building on a recently proposed stable VIC controller as the policy parameterization, we introduce a novel policy search algorithm that is inspired by the Cross-Entropy Method and inherently guarantees stability. Our experimental studies confirm the feasibility and usefulness of the stability guarantee and also feature, to the best of our knowledge, the first successful application of RL with all-the-time-stability on the benchmark problem of peg-in-hole.
Similar resources
Composable Deep Reinforcement Learning for Robotic Manipulation
Model-free deep reinforcement learning has been shown to exhibit good performance in domains ranging from video games to simulated robotic manipulation and locomotion. However, model-free methods are known to perform poorly when the interaction time with the environment is limited, as is the case for most real-world robotic tasks. In this paper, we study how maximum entropy policies trained usi...
Deep Reinforcement Learning for Robotic Manipulation
Reinforcement learning holds the promise of enabling autonomous robots to learn large repertoires of behavioral skills with minimal human intervention. However, robotic applications of reinforcement learning often compromise the autonomy of the learning process in favor of achieving training times that are practical for real physical systems. This typically involves introducing hand-engineered ...
PAC Reinforcement Learning with Rich Observations
We propose and study a new model for reinforcement learning with rich observations, generalizing contextual bandits to sequential decision making. These models require an agent to take actions based on observations (features) with the goal of achieving long-term performance competitive with a large set of policies. To avoid barriers to sample-efficient learning associated with large observation...
Reinforcement Learning for Appearance Based Visual Servoing in Robotic Manipulation
The objective of this paper is to develop a new appearance based visual servoing method that needs no prior structuring of the environment and also eliminates the correspondence problem associated with conventional visual servoing methods. Detailed description of object appearance and its generation are provided in this paper. In addition, owing to the non-linear and high dimensional nature of ...
Time manipulation technique for speeding up reinforcement learning in simulations
A technique for speeding up reinforcement learning algorithms by using time manipulation is proposed. It is applicable to failure-avoidance control problems running in a computer simulation. Turning the time of the simulation backwards on failure events is shown to speed up the learning by 260% and improve the state space exploration by 12% on the cart-pole balancing task, compared to the conve...
Journal
Journal title: IEEE Robotics and Automation Letters
Year: 2021
ISSN: 2377-3766
DOI: https://doi.org/10.1109/lra.2020.3028529